Hector (API)

Hector
Original author(s) Ran Tavory
Stable release 0.7.0-23
Preview release none / n/a
Development status Active
Written in Java
Operating system Cross-platform
Available in English
Type Column-oriented DBMS
License MIT License
Website https://github.com/rantav/hector

Hector is a high-level Java client API for Apache Cassandra. It is named after Hector, a warrior of Troy in Greek mythology. Hector is an alternative to the earlier Cassandra Java Client and to the raw Thrift client that ships with Cassandra,[1] whose API it encapsulates.[2] Its dependencies are also available through a Maven repository.[3]

History

Cassandra ships with a low-level Thrift interface, which left room for a friendlier API aimed at application developers. Ran Tavory developed Hector as a high-level interface that hides the shortcomings of the raw Thrift client. It is released under the MIT License, which permits use, modification, and redistribution.

Tavory described his motivation as follows:

"Over the last couple of days I got the conclusion that the java client I’ve been using so far to speak to Cassandra wasn’t satisfactory. I used the one simply called cassandra-java-client, which is a good start but had some shortcomings I could just not live with (no support for Cassandra v0.5, no JMX and no failover). So I’ve written my own."

Features

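The high-level features of Hector are:[1]

  • A high-level object-oriented interface to Cassandra
  • Failover support
  • Connection pooling
  • JMX support
  • Support for the Command design pattern, which lets clients concentrate on their business logic while Hector takes care of the required plumbing

Failover

Hector offers three failover policies:[4]

Type                              Comment
FAIL_FAST                         Fails immediately on error, without trying another host
ON_FAIL_TRY_ONE_NEXT_AVAILABLE    Tries one more host before giving up
ON_FAIL_TRY_ALL_AVAILABLE         Tries all available hosts before giving up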
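
The three failover policies in the table above differ only in how many hosts are contacted before an error is returned to the caller. The following is a minimal sketch of that idea, not Hector's implementation: the class FailoverSketch, its execute method, and the attempt counts derived from each policy name are assumptions made for illustration.

```java
import java.util.List;
import java.util.function.Function;

/** Illustrative sketch only; these are not Hector's actual classes. */
final class FailoverSketch {

    /** Policy names mirror the table; the attempt budgets are this sketch's reading of them. */
    enum Policy {
        FAIL_FAST,
        ON_FAIL_TRY_ONE_NEXT_AVAILABLE,
        ON_FAIL_TRY_ALL_AVAILABLE;

        /** Maximum number of hosts to contact, given how many hosts are known. */
        int maxAttempts(int hostCount) {
            switch (this) {
                case FAIL_FAST:
                    return 1;
                case ON_FAIL_TRY_ONE_NEXT_AVAILABLE:
                    return Math.min(2, hostCount);
                default:
                    return hostCount;
            }
        }
    }

    /** Runs the operation against hosts in list order until one succeeds or the budget is spent. */
    static <T> T execute(List<String> hosts, Policy policy, Function<String, T> operation) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("no hosts configured");
        }
        RuntimeException last = null;
        int attempts = policy.maxAttempts(hosts.size());
        for (int i = 0; i < attempts; i++) {
            try {
                return operation.apply(hosts.get(i));
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next host
            }
        }
        throw last; // every permitted attempt failed
    }
}
```

With three hosts of which only the last is reachable, ON_FAIL_TRY_ALL_AVAILABLE succeeds on the third attempt, while the other two policies surface the error from the first or second host.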

Availability metrics

Hector exposes availability counters and statistics through JMX.[5]
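
For readers unfamiliar with JMX, the sketch below shows how counters of this kind can be published with the standard javax.management API. It is not Hector's code: the class names, the attribute names SuccessCount and FailureCount, and the object name used in the usage note are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Illustrative sketch only; the attribute names are assumptions, not Hector's. */
final class AvailabilityJmxSketch {

    /** Standard MBean interface: JMX derives read-only attributes from the getters. */
    public interface AvailabilityCountersMBean {
        long getSuccessCount();
        long getFailureCount();
    }

    /** Hypothetical counter holder that a client would update after each request. */
    public static final class AvailabilityCounters implements AvailabilityCountersMBean {
        private final AtomicLong successes = new AtomicLong();
        private final AtomicLong failures = new AtomicLong();

        void recordSuccess() { successes.incrementAndGet(); }
        void recordFailure() { failures.incrementAndGet(); }

        @Override public long getSuccessCount() { return successes.get(); }
        @Override public long getFailureCount() { return failures.get(); }
    }

    /** Registers a fresh counter set with the platform MBean server under the given name. */
    static AvailabilityCounters register(String objectName) {
        try {
            AvailabilityCounters counters = new AvailabilityCounters();
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            server.registerMBean(counters, new ObjectName(objectName));
            return counters;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads a numeric attribute back through JMX, as a monitoring tool would. */
    static long readCounter(String objectName, String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return (Long) server.getAttribute(new ObjectName(objectName), attribute);
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a call such as register("sketch.hector:type=AvailabilityCounters"), the counters are visible to any JMX console attached to the JVM.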

Load balancing

Hector supports pluggable load balancing through the LoadBalancingPolicy interface and ships with two implementations. The default, LeastActiveBalancingPolicy, routes each request to the pool with the lowest number of active connections, ensuring a good spread of utilisation across the cluster. RoundRobinBalancingPolicy is a simple round-robin distribution algorithm.[6]
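
The difference between the two strategies can be sketched as follows. This is an illustration under stated assumptions rather than Hector's code: the Policy interface, its choose method, and the HostPool type are hypothetical, and Hector's LoadBalancingPolicy may declare different methods.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; the types below are hypothetical stand-ins. */
final class LoadBalancingSketch {

    /** Stand-in for a per-host connection pool that tracks its active connections. */
    static final class HostPool {
        final String host;
        final AtomicInteger active = new AtomicInteger();

        HostPool(String host, int activeConnections) {
            this.host = host;
            this.active.set(activeConnections);
        }
    }

    /** Hypothetical policy contract: pick one pool from a non-empty list. */
    interface Policy {
        HostPool choose(List<HostPool> pools);
    }

    /** Picks the pool with the fewest active connections; ties go to the earliest pool. */
    static final class LeastActive implements Policy {
        @Override
        public HostPool choose(List<HostPool> pools) {
            HostPool best = pools.get(0);
            for (HostPool candidate : pools) {
                if (candidate.active.get() < best.active.get()) {
                    best = candidate;
                }
            }
            return best;
        }
    }

    /** Cycles through the pools in list order, regardless of their load. */
    static final class RoundRobin implements Policy {
        private final AtomicInteger next = new AtomicInteger();

        @Override
        public HostPool choose(List<HostPool> pools) {
            int index = Math.floorMod(next.getAndIncrement(), pools.size());
            return pools.get(index);
        }
    }
}
```

Least-active selection adapts to uneven load at the cost of inspecting every pool, whereas round-robin is stateless with respect to load and needs only a counter.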

Pooling

The ExhaustedPolicy controls how the underlying client connection pools behave when no client is free. Three options are available:[7]

Type                    Comment
WHEN_EXHAUSTED_FAIL     Fails acquisition when no more clients are available
WHEN_EXHAUSTED_GROW     Grows the pool automatically to react to load increases
WHEN_EXHAUSTED_BLOCK    Blocks on acquisition until a client becomes available (the default)
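
The behaviour of the three options can be sketched with a semaphore-backed pool. This is a minimal illustration, not Hector's pool: the Pool class and its acquire, release, capacity, and available methods are hypothetical, and only the policy names come from the table above.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; this is not Hector's connection pool. */
final class ExhaustedPolicySketch {

    enum ExhaustedPolicy { WHEN_EXHAUSTED_FAIL, WHEN_EXHAUSTED_GROW, WHEN_EXHAUSTED_BLOCK }

    /** Hypothetical pool that hands out permits instead of real client connections. */
    static final class Pool {
        private final Semaphore permits;
        private final AtomicInteger capacity;
        private final ExhaustedPolicy policy;

        Pool(int size, ExhaustedPolicy policy) {
            this.permits = new Semaphore(size);
            this.capacity = new AtomicInteger(size);
            this.policy = policy;
        }

        /** Acquires a client slot, applying the configured policy when none is free. */
        void acquire() {
            if (permits.tryAcquire()) {
                return; // a slot was free, so the policy does not come into play
            }
            switch (policy) {
                case WHEN_EXHAUSTED_FAIL:
                    throw new IllegalStateException("pool exhausted");
                case WHEN_EXHAUSTED_GROW:
                    // Hand out an extra slot; its later release() adds a permit,
                    // so the pool stays larger from then on.
                    capacity.incrementAndGet();
                    return;
                default:
                    permits.acquireUninterruptibly(); // wait until a slot is released
            }
        }

        /** Returns a slot to the pool. */
        void release() { permits.release(); }

        int capacity() { return capacity.get(); }

        int available() { return permits.availablePermits(); }
    }
}
```

Blocking protects the cluster from connection spikes at the price of caller latency, growing favours throughput, and failing makes overload visible to the caller immediately.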

Code examples

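The following example implements a simple distributed hash table on top of Cassandra. The execute and createColumnPath helpers and the COLUMN_NAME constant belong to the enclosing example class and are not shown.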

  /**
   * Insert a new value keyed by key
   * @param key Key for the value
   * @param value the String value to insert
   */
  public void insert(final String key, final String value) throws Exception {
    execute(new Command<Void>(){
      public Void execute(final Keyspace ks) throws Exception {
        ks.insert(key, createColumnPath(COLUMN_NAME), bytes(value));
        return null;
      }
    });
  }
 
  /**
   * Get a string value.
   * @return The string value; null if no value exists for the given key.
   */
  public String get(final String key) throws Exception {
    return execute(new Command<String>(){
      public String execute(final Keyspace ks) throws Exception {
        try {
          return string(ks.getColumn(key, createColumnPath(COLUMN_NAME)).getValue());
        } catch (NotFoundException e) {
          return null;
        }
      }
    });
  }
 
  /**
   * Delete a key from cassandra
   */
  public void delete(final String key) throws Exception {
    execute(new Command<Void>(){
      public Void execute(final Keyspace ks) throws Exception {
        ks.remove(key, createColumnPath(COLUMN_NAME));
        return null;
      }
    });
  }
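
The methods above pass anonymous Command objects to an execute helper, so that borrowing and returning a client happens in one place. The sketch below illustrates that division of labour with in-memory stand-ins. FakeKeyspace, FakePool, and runWith are hypothetical names, and the Command class here is a simplified analogue, not Hector's own class.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; an in-memory map replaces the Cassandra cluster. */
final class CommandPatternSketch {

    /** Hypothetical stand-in for a keyspace handle. */
    static final class FakeKeyspace {
        private final Map<String, String> data = new HashMap<>();

        void insert(String key, String value) { data.put(key, value); }
        String get(String key) { return data.get(key); }
    }

    /** Hypothetical pool that counts how many handles are currently borrowed. */
    static final class FakePool {
        private final FakeKeyspace keyspace = new FakeKeyspace();
        int borrowed;

        FakeKeyspace borrow() { borrowed++; return keyspace; }
        void release(FakeKeyspace ks) { borrowed--; }
    }

    /** Simplified command: subclasses supply business logic, runWith supplies the plumbing. */
    abstract static class Command<T> {

        /** Business logic written by the caller. */
        abstract T execute(FakeKeyspace ks);

        /** Plumbing: borrow a handle, run the logic, and always give the handle back. */
        final T runWith(FakePool pool) {
            FakeKeyspace ks = pool.borrow();
            try {
                return execute(ks);
            } finally {
                pool.release(ks);
            }
        }
    }
}
```

Because the handle is released in a finally block, callers cannot leak it even when their logic throws.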

References

  1. Ran Tavory. "Hector – a Java Cassandra client". PrettyPrint.me. http://prettyprint.me/2010/02/23/hector-a-java-cassandra-client/comment-page-1/. Retrieved 2011-03-23. "Out of the box Cassandra provides a raw thrift client, which is OK, but lacks many features essential to real world clients. I’ve built Hector to fill this gap. Here are the high level features of Hector, currently hosted at github.
    • A high-level object oriented interface to cassandra.
    • Failover support.
    • Connection pooling.
    • JMX support.
    • Support for the Command design pattern to allow clients to concentrate on their business logic and let hector take care of the required plumbing."
  2. "Hector Client for Apache Cassandra: Encapsulation of Thrift API". DataStax. http://www.datastax.com/sites/default/files/hector-v2-client-doc.pdf. Retrieved 2011-04-12. "Hector now completely encapsulates the Thrift API so developers have to deal only with the Hector client using familiar design patterns. The original API is still available for existing users to transition their current projects as well as for those who are comfortable working with Thrift."
  3. "Hector Client for Apache Cassandra: Fully Mavenized". DataStax. http://www.datastax.com/sites/default/files/hector-v2-client-doc.pdf. Retrieved 2011-04-12. "Since the beta release of Cassandra 0.7.0, Riptano has been offering maven repository access for dependencies required for Cassandra usage via Hector."
  4. Ran Tavory. "Load balancing and improved failover in Hector". PrettyPrint.me. http://prettyprint.me/2010/03/03/load-balancing-and-improved-failover-in-hector/. Retrieved 2011-03-23. "I’ve added a very simple load balancing feature, as well as improved failover behavior to Hector. Hector is a Java Cassandra client, to read more about it please see my previous post Hector – a Java Cassandra client. In version 0.5.0-6 I added poor-man’s load balancing as well as improved failover behavior."
  5. "Hector Client for Apache Cassandra: Availability of Metrics". DataStax. http://www.datastax.com/sites/default/files/hector-v2-client-doc.pdf. Retrieved 2011-04-12. "To facilitate smoother operations and better awareness of performance characteristics, Hector exposes both availability counters and, optionally, performance statistics through JMX."
  6. "Hector Client for Apache Cassandra: Basic Load Balancing". DataStax. http://www.datastax.com/sites/default/files/hector-v2-client-doc.pdf. Retrieved 2011-04-12. "Hector provides for pluggable load balancing through the LoadBalancingPolicy interface. Out of the box, two basic implementations are provided: LeastActiveBalancingPolicy (the default) and RoundRobinBalancingPolicy. LeastActiveBalancingPolicy routes requests to the pools with the lowest number of active connections. This ensures a good spread of utilization across the cluster by sending requests to the machine that has the least number of connections. RoundRobinBalancingPolicy implements a simple round-robin distribution algorithm."
  7. "Hector Client for Apache Cassandra: Configuration of Pooling". DataStax. http://www.datastax.com/sites/default/files/hector-v2-client-doc.pdf. Retrieved 2011-04-12. "The behavior of the underlying pools of client connections can be controlled by the ExhaustedPolicy. […]"
